Matrix Calculus

Matrix calculus is an important foundation for optimal control. In this section, we adopt a standard convention known as numerator layout and collect some useful rules. Under this convention, a derivative takes the shape of its numerator and the transposed shape of its denominator: a quantity in the numerator appears as a scalar, column vector, or matrix, while a quantity in the denominator appears as a transposed scalar, column vector, or matrix.

$ \newcommand{\pd}[2]{\frac{\partial#1}{\partial#2}} \newcommand{\B}[1]{\mathbf{#1}} \newcommand{\A}{\B{A}} \newcommand{\x}{\B{x}} \newcommand{\u}{\B{u}} $ Click this cell to see defined latex functions.

Scalar $y$ derivative w.r.t. column vector $\B{x}$ (nx1); the result is a row vector (1xn):

$\pd{y}{\B{x}} = \begin{bmatrix} \pd{y}{x_1} & \pd{y}{x_2} & \ldots & \pd{y}{x_n} \end{bmatrix}$

Column vector $\B{y}$ (mx1) derivative w.r.t. scalar $x$:

$\pd{\B{y}}{x} = \begin{bmatrix} \pd{y_1}{x} \\ \pd{y_2}{x} \\ \vdots \\ \pd{y_m}{x} \end{bmatrix}$

Column vector $\B{y}$ (mx1) derivative w.r.t. column vector $\B{x}$ (nx1); the result is an mxn matrix:

$\pd{\B{y}}{\B{x}} = \begin{bmatrix} \pd{y_1}{x_1} & \pd{y_1}{x_2} & \ldots & \pd{y_1}{x_n} \\ \pd{y_2}{x_1} & \pd{y_2}{x_2} & \ldots & \pd{y_2}{x_n} \\ \vdots & \vdots & \ddots & \vdots \\ \pd{y_m}{x_1} & \pd{y_m}{x_2} & \ldots & \pd{y_m}{x_n} \\ \end{bmatrix}$
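As a numerical sanity check of this layout, the following minimal sketch approximates $\pd{\B{y}}{\B{x}}$ by central differences using only the Python standard library. The helper `jacobian`, the example function `f`, and the step size `h` are illustrative assumptions and not part of the convention itself.

```python
def jacobian(f, x, h=1e-6):
    """Approximate the Jacobian of f at x in numerator layout.

    f maps a list of n floats to a list of m floats. The result is a
    list of m rows with n entries each: J[i][j] ~ d y_i / d x_j.
    """
    n = len(x)
    m = len(f(x))
    J = [[0.0] * n for _ in range(m)]
    for j in range(n):
        xp, xm = list(x), list(x)
        xp[j] += h  # perturb the j-th input forward
        xm[j] -= h  # and backward
        fp, fm = f(xp), f(xm)
        for i in range(m):
            J[i][j] = (fp[i] - fm[i]) / (2.0 * h)
    return J


def f(x):
    # Hypothetical example: y in R^3 (m = 3) as a function of x in R^2 (n = 2).
    return [x[0] * x[1], x[0] + 2.0 * x[1], x[1] ** 2]


J = jacobian(f, [1.0, 2.0])
print(len(J), len(J[0]))  # number of rows (m) and columns (n)
for row in J:
    print(row)
```

Each row of `J` corresponds to one component of $\B{y}$ and each column to one component of $\B{x}$, matching the mxn arrangement above.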

Scalar $y$ derivative w.r.t. matrix $\B{X}$ (pxq); note the transpose, which arises because $\B{X}$ is in the denominator, so the result is qxp:

$\pd{y}{\B{X}} = \begin{bmatrix} \pd{y}{x_{11}} & \pd{y}{x_{21}} & \ldots & \pd{y}{x_{p1}} \\ \pd{y}{x_{12}} & \pd{y}{x_{22}} & \ldots & \pd{y}{x_{p2}} \\ \vdots & \vdots & \ddots & \vdots \\ \pd{y}{x_{1q}} & \pd{y}{x_{2q}} & \ldots & \pd{y}{x_{pq}} \\ \end{bmatrix}$
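To make the transposed shape concrete, here is a small sketch that differentiates a scalar with respect to a matrix by central differences and stores the result in numerator layout. The helper `scalar_by_matrix`, the test function $y = \mathrm{tr}(\B{A}\B{X})$, and the particular numbers are assumptions chosen for illustration; for this function the numerator-layout derivative is $\B{A}$ itself.

```python
def scalar_by_matrix(f, X, h=1e-6):
    """Approximate dy/dX in numerator layout for a scalar function f.

    X is p x q (list of p rows). The result D is q x p, with
    D[j][i] ~ d y / d x_ij, i.e. the transposed arrangement.
    """
    p, q = len(X), len(X[0])
    D = [[0.0] * p for _ in range(q)]
    for i in range(p):
        for j in range(q):
            Xp = [row[:] for row in X]
            Xm = [row[:] for row in X]
            Xp[i][j] += h
            Xm[i][j] -= h
            D[j][i] = (f(Xp) - f(Xm)) / (2.0 * h)
    return D


# Hypothetical data: A is q x p = 3 x 2, X is p x q = 2 x 3.
A = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
X = [[0.5, -1.0, 2.0], [1.5, 0.0, -0.5]]


def trace_AX(X):
    # tr(A X) = sum over j, i of A[j][i] * X[i][j]
    return sum(A[j][i] * X[i][j] for j in range(len(A)) for i in range(len(X)))


D = scalar_by_matrix(trace_AX, X)
print(len(D), len(D[0]))  # rows and columns of the derivative
for row in D:
    print(row)
```

The result has as many rows as $\B{X}$ has columns, and vice versa, which is the transpose noted above.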

Matrix $\B{Y}$ (pxq) derivative w.r.t. scalar $x$:

$\pd{\mathbf{Y}}{x} = \begin{bmatrix} \pd{y_{11}}{x} & \pd{y_{12}}{x} & \ldots & \pd{y_{1q}}{x} \\ \pd{y_{21}}{x} & \pd{y_{22}}{x} & \ldots & \pd{y_{2q}}{x} \\ \vdots & \vdots & \ddots & \vdots \\ \pd{y_{p1}}{x} & \pd{y_{p2}}{x} & \ldots & \pd{y_{pq}}{x} \\ \end{bmatrix}$

Matrix differential $d\B{X}$ (pxq):

$d\mathbf{X} = \begin{bmatrix} dx_{11} & dx_{12} & \ldots & dx_{1q} \\ dx_{21} & dx_{22} & \ldots & dx_{2q} \\ \vdots & \vdots & \ddots & \vdots \\ dx_{p1} & dx_{p2} & \ldots & dx_{pq} \end{bmatrix}$

Vector by Vector Identities

$\pd{\A\x}{\x} = \A$

$\pd{\x^T\A}{\x} = \A^T$

$\pd{\A\u(\x)}{\x} = \A\pd{\u(\x)}{\x}$
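These identities can be spot-checked numerically. The sketch below compares a central-difference Jacobian against each right-hand side using only the Python standard library. The helpers (`jacobian`, `matvec`, `matmul`, `transpose`, `close`), the matrix `A`, the points `x` and `z`, and the nonlinear map `u` are all hypothetical choices for illustration.

```python
def jacobian(f, x, h=1e-6):
    """Central-difference Jacobian in numerator layout (m rows, n columns)."""
    m, n = len(f(x)), len(x)
    J = [[0.0] * n for _ in range(m)]
    for j in range(n):
        xp, xm = list(x), list(x)
        xp[j] += h
        xm[j] -= h
        fp, fm = f(xp), f(xm)
        for i in range(m):
            J[i][j] = (fp[i] - fm[i]) / (2.0 * h)
    return J


def matvec(M, v):
    return [sum(a * b for a, b in zip(row, v)) for row in M]


def matmul(M, N):
    return [[sum(M[i][k] * N[k][j] for k in range(len(N)))
             for j in range(len(N[0]))] for i in range(len(M))]


def transpose(M):
    return [list(col) for col in zip(*M)]


def close(P, Q, tol=1e-5):
    """True if P and Q have the same shape and agree entrywise within tol."""
    if len(P) != len(Q) or any(len(p) != len(q) for p, q in zip(P, Q)):
        return False
    return all(abs(a - b) <= tol for p, q in zip(P, Q) for a, b in zip(p, q))


A = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]  # hypothetical 2x3 matrix
x = [0.3, -1.2, 2.0]                     # point in R^3
z = [0.7, -0.4]                          # point in R^2, used for z^T A

# d(Ax)/dx = A
check_Ax = close(jacobian(lambda v: matvec(A, v), x), A)

# d(z^T A)/dz = A^T; the row vector z^T A is stored as a flat list
check_xTA = close(jacobian(lambda v: matvec(transpose(A), v), z), transpose(A))


# d(A u(x))/dx = A du/dx, with a hypothetical nonlinear u: R^3 -> R^3
def u(v):
    return [v[0] * v[1], v[1] + v[2], v[2] ** 2]


# Analytic Jacobian of u at x, written out by hand
du_dx = [[x[1], x[0], 0.0], [0.0, 1.0, 1.0], [0.0, 0.0, 2.0 * x[2]]]
check_Au = close(jacobian(lambda v: matvec(A, u(v)), x), matmul(A, du_dx))

print(check_Ax, check_xTA, check_Au)
```

A finite-difference check like this does not prove the identities, but it is a quick way to catch a layout or transpose mistake.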

Scalar by Vector Identities

$\pd{\x^T\A\x}{\x} = \x^T(\A + \A^T)$

$\pd{\B{a}^T\x}{\x} = \pd{\x^T \B{a}}{\x} = \B{a}^T$
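As a worked check of the quadratic-form identity, here is a short component-wise derivation. It assumes $\A$ is nxn with entries $a_{ij}$, notation introduced here only for this sketch.

$\x^T\A\x = \sum_{i=1}^{n}\sum_{j=1}^{n} a_{ij} x_i x_j$

$\pd{(\x^T\A\x)}{x_k} = \sum_{j=1}^{n} a_{kj} x_j + \sum_{i=1}^{n} a_{ik} x_i = (\A\x)_k + (\A^T\x)_k$

Arranging these partials as a row vector, as numerator layout requires for a scalar differentiated by a column vector, gives

$\pd{\x^T\A\x}{\x} = (\A\x)^T + (\A^T\x)^T = \x^T\A^T + \x^T\A = \x^T(\A + \A^T)$

When $\A$ is symmetric, this reduces to $2\x^T\A$.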

References